Image Spam Filtering by Content Obscuring Detection

نویسندگان

  • Battista Biggio
  • Giorgio Fumera
  • Ignazio Pillai
  • Fabio Roli
چکیده

We address the problem of filtering image spam, a rapidly spreading kind of spam in which the text message is embedded into attached images to defeat spam filtering techniques based on the analysis of e-mail’s body text. We propose an approach based on low-level image processing techniques to detect one of the main characterstics of most image spam, namely the use of content obscuring techniques to defeat OCR tools. A preliminary experimental evaluation of our approach is reported on a personal data set of spam images publicly available.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Image spam filtering using textual and visual information

In this paper we focus on the so-called image spam, which consists in embedding the spam message into images attached to e-mails to circumvent statistical techniques based on the analysis of body text of e-mails (like the “bayesian filters”), and in applying content obscuring techniques to such images to make them unreadable by standard OCR systems without compromising human readability. We arg...

متن کامل

Filtering Image Spam with Near-Duplicate Detection

A new trend in email spam is the emergence of image spam. Although current anti-spam technologies are quite successful in filtering text-based spam emails, the new image spams are substantially more difficult to detect, as they employ a variety of image creation and randomization algorithms. Spam image creation algorithms are designed to defeat well-known vision algorithms such as optical chara...

متن کامل

Embedded-Text Detection and Its Application to Anti-Spam Filtering

Embedded-Text Detection and Its Application to Anti-Spam Filtering Ching-Tung Wu Embedded-text in images usually carry important messages about the content. In the past, several algorithms have been proposed to detect text boxes in video frames. Previous work often followed a multi-step framework using a combination of image-analysis and machine-learning techniques. In this work, we propose a u...

متن کامل

A Sobel Edge Detection Algorithm Based System for Analyzing and Classifying Image Based Spam

Early spam mails were only text-based, however spammers have moved to more sophisticated spamming techniques that involve images now generally termed image based spam. In most image-based spam, the entire spam message, which could be sometimes text, is embedded in an image of any format. This type of spam emails creates another dimension to the spam filtering problem scenario. Extracting text f...

متن کامل

Trends in Combating Image Spam E-mails

With the rapid adoption of Internet as an easy way to communicate, the amount of unsolicited e-mails, known as spam e-mails, has been growing rapidly. The major problem of spam e-mails is the loss of productivity and a drain on IT resources. Today, we receive spam more rapidly than the legitimate e-mails. Initially, spam e-mails contained only textual messages which were easily detected by the ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007